LIFEGUARD: Locating Internet Failure Events and Generating Usable Alternate Routes Dynamically

Author

  • Colin Scott
Abstract

The Internet is far from achieving the “five 9s” (99.999%) of availability provided by the public telephone network. Based on anecdotal evidence as well as a first-of-its-kind measurement study assessing routing failures affecting Amazon’s EC2 data centers, we find that network outages are surprisingly common and highly problematic. These outages incur large costs, disrupt critical services and applications, and are generally difficult to detect and isolate. In this thesis we argue that it is possible to isolate wide-area network faults at router-level granularity without requiring control over both end-points. We present LIFEGUARD, our prototype system for Locating Internet Failure Events and Generating Usable Alternate Routes Dynamically. LIFEGUARD leverages distributed network measurements, a continually updated historical path atlas, and novel measurement tools such as spoofed forward traceroute to infer the direction of failures, isolate faults, suggest alternate working paths, and provide views into routing behavior from before the onset of outages. We evaluate LIFEGUARD with a four-month study monitoring real outages on the Internet, and present specific examples and aggregate characteristics of the failures we observed.

Presentation of work given on June 2nd, 2011. Thesis and presentation approved by:


Similar resources

Systems for Improving Internet Availability and Performance

Ethan B. Katz-Bassett. Co-Chairs of the Supervisory Committee: Professor Thomas E. Anderson (Department of Computer Science and Engineering) and Associate Professor Arvind Krishnamurthy (Department of Computer Science and Engineering). The Internet’s role in our lives continues to grow, but it often fails to provide the availability and performa...


Inter-domain collaborative routing (IDCR): Server selection for optimal client performance

Communication between institutions, or domains, residing in the Internet requires a route to be created between the routing domains. Each of these domains is controlled by a single administrative authority, and is referred to as an Autonomous System (AS). Control of routes that move the data in the Internet between ASes is problematic. If an AS requires certain route characteristics beyond the ...


Lifeguard: SWIM-ing with Situational Awareness

SWIM is a peer-to-peer group membership protocol with attractive scaling and robustness properties. However, slow message processing can cause SWIM to mark healthy members as failed (so-called false positive failure detection), despite the inclusion of a mechanism to avoid this. We identify the properties of SWIM that lead to the problem, and propose Lifeguard, a set of extensions to SWIM which con...


Autonomous Emergency Landing of a Helicopter: Motion Planning with Hard Time-Constraints

Engine malfunctions during helicopter flight pose a large risk to pilot and crew. Without a quick and coordinated reaction, such situations lead to a complete loss of control. An autonomous landing system is capable of reacting quickly to regain control; however, current emergency landing methods focus only on the offline generation of dynamically feasible trajectories while ignoring the more s...


Node Sensing & Dynamic Discovering Routes for Wireless Sensor Networks

The applications of Wireless Sensor Networks (WSN) contain a wide variety of scenarios. In most of them, the network is composed of a significant number of nodes deployed in an extensive area in which not all nodes are directly connected. Then, the data exchange is supported by multihop communications. Routing protocols are in charge of discovering and maintaining the routes in the network. How...




Publication date: 2011